SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANTS: HOFVANDER, Per 
PERSSON, Per T 
WIKSTROM, Olle 
TALLBERG, Anneli 

(ii) TITLE OF INVENTION: GENETICALLY ENGINEERED MODIFICATION OF 
POTATO TO FORM AMYLOPECTIN-TYPE STARCH 

(iii) NUMBER OF SEQUENCES: 21 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Burns, Doane, Swecker & Mathis 

(B) STREET: George Mason Bldg., Washington & Prince Sts. 

(C) CITY: Alexandria 

(D) STATE: Virginia 

(E) COUNTRY: United States 

(F) ZIP: 22313-1404 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/070,455 

(B) FILING DATE: 09-JUN-1993 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Crane-Feury, Sharon E 

(B) REGISTRATION NUMBER: 36,113 

(C) REFERENCE/DOCKET NUMBER: 003300-293 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (703) 836-6620 

(B) TELEFAX: (703) 836-2021 



(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 342 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 217..342 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 

TGCATGTTTC CCTACATTCT ATTTAGAATC GTGTTGTGGT GTATAAACGT TGTTTCATAT 60 

CTCATCTCAT CTATTCTGAT TTTGATTCTC TTGCCTACTG TAATCGGTGA TAAATGTGAA 120 

TGCTTCCTTT CTTCTCAGAA ATCAATTTCT GTTTTGTTTT TGTTCATCTG TAGCTTATTC 180 



TCTGGTAGAT TCCCCTTTTT GTAGACCACA CATCAC ATG GCA AGC ATC ACA GCT 234 
                                        Met Ala Ser Ile Thr Ala 
                                        1               5 

TCA CAC CAC TTT GTG TCA AGA AGC CAA ACT TCA CTA GAC ACC AAA TCA 282 
Ser His His Phe Val Ser Arg Ser Gln Thr Ser Leu Asp Thr Lys Ser 
            10                  15                  20 

ACC TTG TCA CAG ATA GGA CTC AGG AAC CAT ACT CTG ACT CAC AAT GGT 330 
Thr Leu Ser Gln Ile Gly Leu Arg Asn His Thr Leu Thr His Asn Gly 
        25                  30                  35 

TTA AGG GCT GTT 342 
Leu Arg Ala Val 
    40 
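
The CDS feature of SEQ ID NO:1 (LOCATION 217..342) can be checked against the listed amino acids with a short script. The following is a minimal sketch, not part of the filed listing: it assumes the standard genetic code, treats feature coordinates as 1-based and inclusive, and reads the OCR-ambiguous character at position 122 as G. The names `extract` and `translate` are illustrative helpers only.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# SEQ ID NO:1 as listed (342 bases), with spaces and position numbers removed.
SEQ_ID_1 = (
    "TGCATGTTTCCCTACATTCTATTTAGAATCGTGTTGTGGTGTATAAACGTTGTTTCATAT"
    "CTCATCTCATCTATTCTGATTTTGATTCTCTTGCCTACTGTAATCGGTGATAAATGTGAA"
    "TGCTTCCTTTCTTCTCAGAAATCAATTTCTGTTTTGTTTTTGTTCATCTGTAGCTTATTC"
    "TCTGGTAGATTCCCCTTTTTGTAGACCACACATCACATGGCAAGCATCACAGCT"
    "TCACACCACTTTGTGTCAAGAAGCCAAACTTCACTAGACACCAAATCA"
    "ACCTTGTCACAGATAGGACTCAGGAACCATACTCTGACTCACAATGGT"
    "TTAAGGGCTGTT"
)


def extract(seq, start, end):
    """Return seq[start..end] using 1-based, inclusive coordinates."""
    return seq[start - 1:end]


def translate(dna):
    """Translate complete codons of dna into one-letter amino acid codes."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


cds = extract(SEQ_ID_1, 217, 342)
protein = translate(cds)
print(len(SEQ_ID_1), len(cds), protein)
```

The one-letter string printed by the script corresponds residue for residue to the three-letter codes shown under the coding rows (Met Ala Ser Ile Thr Ala ... Leu Arg Ala Val).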



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2549 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 



AACAAGCTTG ATGGGCTCCA ATCAACAACT AATACTAAGG TAACACCCAA GATGGCATCC 60 

AGAACTGAGA CCAAGAGACC TGGATGCTCA GCTACCATTG TTTGTGGAAA GGGAATGAAC 120 

TTGATCTTTG TGGGTACTGA GGTTGGTCCT TGGAGCAAAA CTGGTGGACT AGGTGATGTT 180 

CTTGGTGGAC TACCACCAGC CCTTGCAGTA AGTCTTTCTT TCATTTGGTT ACCTACTCAT 240 

TCATTACTTA TTTTGTTTAG TTAGTTTCTA CTGCATCAGT CTTTTTATCA TTTAGGCCCG 300 

CGGACATCGG GTAATGACAA TATCCCCCCG TTATGACCAA TACAAAGATG CTTGGGATAC 360 

TGGCGTTGCG GTTGAGGTAC ATCTTCCTAT ATTGATACGG TACAATATTG TTCTCTTACA 420 

TTTCCTGATT CAAGAATGTG ATCATCTGCA GGTCAAAGTT GGAGACAGCA TTGAAATTGT 480 

TCGTTTCTTT CACTGCTATA AACGTGGGGT TGATCGTGTT TTTGTTGACC ACCCAATGTT 540 

CTTGGAGAAA GTAAGCATAT TATGATTATG AATCCGTCCT GAGGGATACG CAGAACAGGT 600 

CATTTTGAGT ATCTTTTAAC TCTACTGGTG CTTTTACTCT TTTAAGGTTT GGGGCAAAAC 660 

TGGTTCAAAA ATCTATGGCC CCAAAGCTGG ACTAGATTAT CTGGACAATG AACTTAGGTT 720 

CAGCTTGTTG TGTCAAGTAA GTTAGTTACT CTTGATTTTT ATGTGGCATT TTACTCTTTT 780 

GTCTTTAATC GTTTTTTTAA CCTTGTTTTC TCAGGCAGCC CTAGAGGCAC CTAAAGTTTT 840 

GAATTTGAAC AGTAGCAACT ACTTCTCAGG ACCATATGGT AATTAACACA TCCTAGTTTC 900 

AGAAAACTCC TTACTATATC ATTGTAGGTA ATCATCTTTA TTTTGCCTAT TCCTGCAGGA 960 

GAGGATGTTC TCTTCATTGC CAATGATTGG CACACAGCTC TCATTCCTTG CTACTTGAAG 1020 

TCAATGTACC AGTCCAGAGG AATCTACTTG AATGCCAAGG TAAAATTTCT TTGTATTCAC 1080 

TCGATTGCAC GTTACCCTGC AAATCAGTAA GGTTGTATTA ATATATGATA AATTTCACAT 1140 

TGCCTCCAGG TTGCTTTCTG CATCCATAAC ATTGCCTACC AAGGTCGATT TTCTTTCTCT 1200 

GACTTCCCTC TTCTCAATCT TCCTGATGAA TTCAGGGGTT CTTTTGATTT CATTGATGGG 1260 

TATGTATTTA TGCTTGAAAT CAGACCTCCA ACTTTTGAAG CTCTTTTGAT GCTAGTAAAT 1320 

TGAGTTTTTA AAATTTTGCA GATATGAGAA GCCTGTTAAG GGTAGGAAAA TCAACTGGAT 1380 

GAAGGCTGGG ATATTAGAAT CACATAGGGT GGTTACAGTG AGCCCATACT ATGCCCAAGA 1440 

ACTTGTCTCT GCTGTTGACA AGGGAGTTGA ATTGGACAGT GTCCTTCGTA AGACTTGCAT 1500 

AACTGGGATT GTGAATGGCA TGGATACACA AGAGTGGAAC CCAGCGACTG ACAAATACAC 1560 

AGATGTCAAA TACGATATAA CCACTGTAAG ATAAGATTTT TCCGACTCCA GTATATACTA 1620 

AATTATTTTG TATGTTTATG AAATTAAAGA GTTCTTGCTA ATCAAAATCT CTATACAGGT 1680 

CATGGACGCA AAACCTTTAC TAAAGGAGGC TCTTCAAGCA GCAGTTGGCT TGCCTGTTGA 1740 

CAAGAAGATC CCTTTGATTG GCTTCATCGG CAGACTTGAG GAGCAGAAAG GTTCAGATAT 1800 

TCTTGTTGCT GCAATTCACA AGTTCATCGG ATTGGATGTT CAAATTGTAG TCCTTGTAAG 1860 

TACCAAATGG ACTCATGGTA TCTCTCTTGT TGAGTTTACT TGTGCCGAAA CTGAAATTGA 1920 

CCTGCTACTC ATCCTATGCA TCAGGGAACT GGCAAAAAGG AGTTTGAGCA GGAGATTGAA 1980 

CAGCTCGAAG TGTTGTACCC TAACAAAGCT AAAGGAGTGG CAAAATTCAA TGTCCCTTTG 2040 

GCTCACATGA TCACTGCTGG TGCTGATTTT ATGTTGGTTC CAAGCAGATT TGAACCTTGT 2100 

GGTCTCATTC AGTTACATGC TATGCGATAT GGAACAGTAA GAACCAGAAG AGCTTGTACC 2160 

TTTTTACTGA GTTTTTAAAA AAAGAATCAT AAGACCTTGT TTTCCATCTA AAGTTTAATA 2220 

ACCAACTAAA TGTTACTGCA GCAAGCTTTT CATTTCTGAA AATTGGTTAT CTGATTTTAA 2280 




CGTAATCACA TGTGAGTCAG GTACCAATCT GTGCATCGAC TGGTGGACTT GTTGACACTG 2340 

TGAAAGAAGG CTATACTGGA TTCCATATGG GAGCCTTCAA TGTTGAAGTA TGTGATTTTA 2400 

CATCAATTGT GTACTTGTAC ATGGTCCATT CTCGTCTTGA TATACCCCTT GTTGCATAAA 2460 

CATTAACTTA TTGCTTCTTG AATTTGGTTA GTGCGATGTT GTTGACCCAG CTGATGTGCT 2520 

TAAGATAGTA ACAACAGTTG CTAGAGCTC 2549 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 492 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..15 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 101..218 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

GAG CTC TCC TGG AAG GTAAGTGTGA ATTTGATAAT TTGCGTAGGT ACTTCAGTTT 55 
Glu Leu Ser Trp Lys 
1               5 

GTTGTTCTCG TCAGCACTGA TGGATTCCAA CTGGTGTTCT TGCAG GAA CCT GCC 109 
                                                  Glu Pro Ala 
                                                  1 

AAG AAA TGG GAG ACA TTG CTA TTG GGC TTA GGA GCT TCT GGC AGT GAA 157 
Lys Lys Trp Glu Thr Leu Leu Leu Gly Leu Gly Ala Ser Gly Ser Glu 
    5                   10                  15 

CCC GGT GTT GAA GGG GAA GAA ATC GCT CCA CTT GCC AAG GAA AAT GTA 205 
Pro Gly Val Glu Gly Glu Glu Ile Ala Pro Leu Ala Lys Glu Asn Val 
20                  25                  30                  35 

GCC ACT CCT TAAATGAGCT TTGGTTATCC TTGTTTCAAC AATAAGATCA 254 
Ala Thr Pro * 



TTAAGCAAAC GTATTTACTA GCGAACTATG TAGAACCCTA TTATGGGGTC TCAATCATCT 314 

ACAAAATGAT TGGTTTTTGC TGGGGAGCAG CAGCATATAA GGCTGTAAAA TCCTGGTTAA 374 

TGTTTTTGTA GGTAAGGGCT ATTTAAGGTG GTGTGGATCA AAGTCAATAG AAAATAGTTA 434 

TTACTAACGT TTGCAACTAA ATACTTAGTA ATGTAGCATA AATAATACTA GAACTAGT 492 


(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 987 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

AAGCTTTAAC GAGATAGAAA ATTATGTTAC TCCGTTTTGT TCATTACTTA ACAAATGCAA 60 

CAGTATCTTG TACCAAATCC TTTCTCTCTT TTCAAACTTT TCTATTTGGC TGTTGACGGA 120 

GTAATCAGGA TACAAACCAC AAGTATTTAA TTGACTCCTC CGCCAGATAT TATGATTTAT 180 

GAATCCTCGA AAAGCCTATC CATTAAGTCC TCATCTATGG ATATACTTGA CAGTATCTTC 240 

CTGTTTGGGT ATTTTTTTTT CCTGCCAAGT GGAACGGAGA CATGTTATGA TGTATACGGG 300 

AAGCTCGTTA AAAAAAAATA CAATAGGAAG AAATGTAACA AACATTGAAT GTTGTTTTTA 360 

ACCATCCTTC CTTTAGCAGT GTATCAATTT TGTAATAGAA CCATGCATCT CAATCTTAAT 420 

ACTAAAATGC AACTTAATAT AGGCTAAACC AAGATAAAGT AATGTATTCA ACCTTTAGAA 480 

TTGTGCATTC ATAATTAGAT CTTGTTTGTC GTAAAAAATT AGAAAATATA TTTACAGTAA 540 

TTTGGAATAC AAAGCTAAGG GGGAAGTAAC TAATATTCTA GTGGAGGGAG GGACCAGTAC 600 

CAGTACCTAG ATATTATTTT TAATTACTAT AATAATAATT TAATTAACAC GAGACATAGG 660 

AATGTCAAGT GGTAGCGTAG GAGGGAGTTG GTTTAGTTTT TTAGATACTA GGAGACAGAA 720 

CCGGACGGCC CATTGCAAGG CCAAGTTGAA GTCCAGCCGT GAATCAACAA AGAGAGGGCC 780 

CATAATACTG TCGATGAGCA TTTCCCTATA ATACAGTGTC CACAGTTGCC TTCTGCTAAG 840 

GGATAGCCAC CCGCTATTCT CTTGACACGT GTCACTGAAA CCTGCTACAA ATAAGGCAGG 900 

CACCTCCTCA TTCTCACTCA CTCACTCACA CAGCTCAACA AGTGGTAACT TTTACTCATC 960 

TCCTCCAATT ATTTCTGATT TCATGCA 987 
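
Each sequence row ends with the cumulative position of its last base, which gives a simple integrity check for transcribed rows. The sketch below is illustrative only and not part of the filed listing: it parses the last four rows of SEQ ID NO:4 and compares each stated position with a running count. The function name `check_rows` and the starting offset of 780 (the position stated on the preceding row) are assumptions of this example.

```python
import re

# Rows copied from the end of SEQ ID NO:4 (positions 781-987).
ROWS = """
CATAATACTG TCGATGAGCA TTTCCCTATA ATACAGTGTC CACAGTTGCC TTCTGCTAAG 840
GGATAGCCAC CCGCTATTCT CTTGACACGT GTCACTGAAA CCTGCTACAA ATAAGGCAGG 900
CACCTCCTCA TTCTCACTCA CTCACTCACA CAGCTCAACA AGTGGTAACT TTTACTCATC 960
TCCTCCAATT ATTTCTGATT TCATGCA 987
"""


def check_rows(text, start=0):
    """Return (stated_end, counted_end, ok) for every numbered row.

    `start` is the number of bases that precede the first row.
    """
    total = start
    results = []
    for line in text.strip().splitlines():
        *groups, number = line.split()
        bases = "".join(groups)
        # Reject rows containing anything other than A, C, G or T.
        if not re.fullmatch(r"[ACGT]+", bases):
            raise ValueError(f"unexpected character in row ending {number}")
        total += len(bases)
        results.append((int(number), total, int(number) == total))
    return results


report = check_rows(ROWS, start=780)
for stated, counted, ok in report:
    print(stated, counted, "ok" if ok else "MISMATCH")
```

A mismatch on any row points to a dropped, duplicated, or misread base group in that row or an earlier one.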

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4964 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

AAGCTTTAAC GAGATAGAAA ATTATGTTAC TCCGTTTTGT TCATTACTTA ACAAATGCAA 60 

CAGTATCTTG TACCAAATCC TTTCTCTCTT TTCAAACTTT TCTATTTGGC TGTTGACGGA 120 

GTAATCAGGA TACAAACCAC AAGTATTTAA TTGACTCCTC CGCCAGATAT TATGATTTAT 180 

GAATCCTCGA AAAGCCTATC CATTAAGTCC TCATCTATGG ATATACTTGA CAGTATCTTG 240 

CTGTTTGGGT ATTTTTTTTT CCTGCCAAGT GGAACGGAGA CATGTTATGA TGTATACGGG 300 

AAGCTCGTTA AAAAAAAATA CAATAGGAAG AAATGTAACA AACATTGAAT GTTGTTTTTA 360 

ACCATCCTTC CTTTAGCAGT GTATCAATTT TGTAATAGAA CCATGCATCT CAATCTTAAT 420 

ACTAAAATGC AACTTAATAT AGGCTAAACC AAGATAAAGT AATGTATTCA ACCTTTAGAA 480 

TTGTGCATTC ATAATTAGAT CTTGTTTGTC GTAAAAAATT AGAAAATATA TTTACAGTAA 540 

TTTGGAATAC AAAGCTAAGG GGGAAGTAAC TAATATTCTA GTGGAGGGAG GGACCAGTAC 600 

CAGTACCTAG ATATTATTTT TAATTACTAT AATAATAATT TAATTAACAC GAGACATAGG 660 

AATGTCAAGT GGTAGCGTAG GAGGGAGTTG GTTTAGTTTT TTAGATACTA GGAGACAGAA 720 

CCGGACGGCC CATTGCAAGG CCAAGTTGAA GTCCAGCCGT GAATCAACAA AGAGAGGGCC 780 

CATAATACTG TCGATGAGCA TTTCCCTATA ATACAGTGTC CACAGTTGCC TTCTGCTAAG 840 

GGATAGCCAC CCGCTATTCT CTTGACACGT GTCACTGAAA CCTGCTACAA ATAAGGCAGG 900 

CACCTCCTCA TTCTCACTCA CTCACTCACA CAGCTCAACA AGTGGTAACT TTTACTCATC 960 

TCCTCCAATT ATTTCTGATT TCATGCATGT TTCCCTACAT TCTATTATGA ATCGTGTTGT 1020 

GGTGTATAAA CGTTGTTTCA TATCTCATCT CATCTATTCT GATTTTGATT CTCTTGCCTA 1080 

CTGTAATCGG TGATAAATGT GAATGCTTCC TTTCTTCTCA GAAATCAATT TCTGTTTTGT 1140 

TTTTGTTCAT CTGTAGCTTA TTCTCTGGTA GATTCCCCTT TTTGTAGACC ACACATCACA 1200 

TGGCAAGCAT CACAGCTTCA CACCACTTTG TGTCAAGAAG CCAAACTTCA CTAGACACCA 1260 

AATCAACCTT GTCACAGATA GGACTCAGGA ACCATACTCT GACTCACAAT GGTTTAAGGG 1320 




CTGTTAACAA GCTTGATGGG CTCCAATCAA CAACTAATAC TAAGGTAACA CCCAAGATGG 1380 

CATCCAGAAC TGAGACCAAG AGACCTGGAT GCTCAGCTAC CATTGTTTGT GGAAAGGGAA 1440 

TGAACTTGAT CTTTGTGGGT ACTGAGGTTG GTCCTTGGAG CAAAACTGGT GGACTAGGTG 1500 

ATGTTCTTGG TGGACTACCA CCAGCCCTTG CAGTAAGTCT TTCTTTCATT TGGTTACCTA 1560 

CTCATTCATT ACTTATTTTG TTTAGTTAGT TTCTACTGCA TCAGTCTTTT TATCATTTAG 1620 

GCCCGCGGAC AGCGGGTAAT GACAATATCC CCCCGTTATG ACCAATACAA AGATGCTTGG 1680 

GATACTGGCG TTGCGGTTGA GGTACATCTT CCTATATTGA TACGGTACAA TATTGTTCTC 1740 

TTACATTTCC TGATTCAAGA ATGTGATCAT CTGCAGGTCA AAGTTGGAGA CAGCATTGAA 1800 

ATTGTTCGTT TCTTTCACTG CTATAAACGT GGGGTTGATC GTGTTTTTGT TGACCACCCA 1860 

ATGTTCTTGG AGAAAGTAAG CATATTATGA TTATGAATCC GTCCTGAGGG ATACGCAGAA 1920 

CAGGTCATTT TGAGTATCTT TTAACTCTAC TGGTGCTTTT ACTCTTTTAA GGTTTGGGGC 1980 

AAAACTGGTT CAAAAATCTA TGGCCCCAAA GCTGGACTAG ATTATCTGGA CAATGAACTT 2040 

AGGTTCAGCT TGTTGTGTCA AGTAAGTTAG TTACTCTTGA TTTTTATGTG GCATTTTACT 2100 

CTTTTGTCTT TAATCGTTTT TTTAACCTTG TTTTCTCAGG CAGCCCTAGA GGCACCTAAA 2160 

GTTTTGAATT TGAACAGTAG CAACTACTTC TCAGGACCAT ATGGTAATTA ACACATCCTA 2220 

GTTTCAGAAA ACTCCTTACT ATATCATTGT AGGTAATCAT CTTTATTTTG CCTATTCCTG 2280 

CAGGAGAGGA TGTTCTCTTC ATTGCCAATG ATTGGCACAC AGCTCTCATT CCTTGCTACT 2340 

TGAAGTCAAT GTACCAGTCC AGAGGAATCT ACTTGAATGC CAAGGTAAAA TTTCTTTGTA 2400 

TTCACTCGAT TGCACGTTAC CCTGCAAATC AGTAAGGTTG TATTAATATA TGATAAATTT 2460 

CACATTGCCT CCAGGTTGCT TTCTGCATCC ATAACATTGC CTACCAAGGT CGATTTTCTT 2520 

TCTCTGACTT CCCTCTTCTC AATCTTCCTG ATGAATTCAG GGGTTCTTTT GATTTCATTG 2580 

ATGGGTATGT ATTTATGCTT GAAATCAGAC CTCCAACTTT TGAAGCTCTT TTGATGCTAG 2640 

TAAATTGAGT TTTTAAAATT TTGCAGATAT GAGAAGCCTG TTAAGGGTAG GAAAATCAAC 2700 

TGGATGAAGG CTGGGATATT AGAATCACAT AGGGTGGTTA CAGTGAGCCC ATACTATGCC 2760 

CAAGAACTTG TCTCTGCTGT TGACAAGGGA GTTGAATTGG ACAGTGTCCT TCGTAAGACT 2820 

TGCATAACTG GGATTGTGAA TGGCATGGAT ACACAAGAGT GGAACCCAGC GACTGACAAA 2880 

TACACAGATG TCAAATACGA TATAACCACT GTAAGATAAG ATTTTTCCGA CTCCAGTATA 2940 

TACTAAATTA TTTTGTATGT TTATGAAATT AAAGAGTTCT TGCTAATCAA AATCTCTATA 3000 

CAGGTCATGG ACGCAAAACC TTTACTAAAG GAGGCTCTTC AAGCAGCAGT TGGCTTGCCT 3060 

GTTGACAAGA AGATCCCTTT GATTGGCTTC ATCGGCAGAC TTGAGGAGCA GAAAGGTTCA 3120 

GATATTCTTG TTGCTGCAAT TCACAAGTTC ATCGGATTGG ATGTTCAAAT TGTAGTCCTT 3180 

GTAAGTACCA AATGGACTCA TGGTATCTCT CTTGTTGAGT TTACTTGTGC CGAAACTGAA 3240 

ATTGACCTGC TACTCATCCT ATGCATCAGG GAACTGGCAA AAAGGATTTT GAGCAGGAGA 3300 

TTGAACAGCT CGAAGTGTTG TACCCTAACA AAGCTAAAGG AGTGGCAAAA TTCAATGTCC 3360 

CTTTGGCTCA CATGATCACT GCTGGTGCTG ATTTTATGTT GGTTCCAAGC AGATTTGAAC 3420 

CTTGTGGTCT CATTCAGTTA CATGCTATGC GATATGGAAC AGTAAGAACC AGAAGAGCTT 3480 

GTACCTTTTT ACTGAGTTTT TAAAAAAAGA ATCATAAGAC CTTGTTTTCC ATCTAAAGTT 3540 

TAATAACCAA CTAAATGTTA CTGCAGCAAG CTTTTCATTT CTGAAAATTG GTTATCTGAT 3600 

TTTAACGTAA TCACATGTGA GTCAGGTACC AATCTGTGCA TCGACTGGTG GACTTGTTGA 3660 

CACTGTGAAA GAAGGCTATA CTGGATTCCA TATGGGAGCC TTCAATGTTG AAGTATGTGA 3720 

TTTTACATCA ATTGTGTACT TGTACATGGT CCATTCTCGT CTTGATATAC CCCTTGTTGC 3780 

ATAAACATTA ACTTATTGCT TCTTGAATTT GGTTAGTGCG ATGTTGTTGA CCCAGCTGAT 3840 

GTGCTTAAGA TAGTAACAAC AGTTGCTAGA GCTCTTGCAG TCTATGGCAC CCTCGCATTT 3900 

GCTGAGATGA TAAAAAATTG CATGTCAGAG GAGCTCTCCT GGAAGGTAAG TGTGAATTTG 3960 

ATAATTTGCG TAGGTACTTC AGTTTGTTGT TCTCGTCAGC ACTGATGGAT TCCAACTGGT 4020 

GTTCTTGCAG GAACCTGCCA AGAAATGGGA GACATTGCTA TTGGGCTTAG GAGCTTCTGG 4080 

CAGTGAACCC GGTGTTGAAG GGGAAGAAAT CGCTCCACTT GCCAAGGAAA ATGTAGCCAC 4140 

TCCTTAAATG AGCTTTGGTT ATCCTTGTTT CAACAATAAG ATCATTAAGC AAACGTATTT 4200 

ACTAGCGAAC TATGTAGAAC CCTATTATGG GGTCTCAATC ATCTACAAAA TGATTGGTTT 4260 

TTGCTGGGGA GCAGCAGCAT ATAAGGCTGT AAAATCCTGG TTAATGTTTT TGTAGGTAAG 4320 

GGCTATTTAA GGTGGTGTGG ATCAAAGTCA ATAGAAAATA GTTATTACTA ACGTTTGCAA 4380 

CTAAATACTT AGTAATGTAG CATAAATAAT ACTAGAACTA GTAGCTAATA TATATGCGTG 4440 

AATTTGTTGT ACCTTTTCTT GCATAATTAT TTGCAGTACA TATATAATGA AAATTACCCA 4500 

AGGAATCAAT GTTTCTTGCT CCGTCCTCCT TTGATGATTT TTTACGCAAT ACAGAGCTAG 4560 

TGTGTTATGT TATAAATTTT GTTTAAAAGA AGTAATCAAA TTCAAATTAG TTGTTTGGTC 4620 

ATATGAAAGA AGCTGCCAGG CTAACTTTGA GGAGATGGCT ATTGAATTTC AAAATGATTA 4680 

TGTGAAAACA ATGCAACATC TATGTCAATC AACACTTAAA TTATTGCATT TAGAAAGATA 4740 

TTTTTGAGCC CATGACACAT TCATTCATAA AGTAAGGTAG TATGTATGAT TGAATGGACT 4800 

ACAGCTCAAT CAAAGCATCT CCTTTACATA ACGGCACTGT CTCTTGTCTA CTACTCTATT 4860 

GGTAGTAGTA GTAGTAATTT TACAATCCAA ATTGAATAGT AATAAGATGC TCTCTATTTA 4920 

CTAAAGTAGT AGTATTATTC TTTCGTTACT CTAAAGCAAC AAAA 4964 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 69 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 1..69 

(D) OTHER INFORMATION: /note= "Amino acid sequence encoded 
by nucleotides 1-207 of SEQ ID NO. 2." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Asn Lys Leu Asp Gly Leu Gln Ser Thr Thr Asn Thr Lys Val Thr Pro 
1               5                   10                  15 

Lys Met Ala Ser Arg Thr Glu Thr Lys Arg Pro Gly Cys Ser Ala Thr 
            20                  25                  30 

Ile Val Cys Gly Lys Gly Met Asn Leu Ile Phe Val Gly Thr Glu Val 
        35                  40                  45 

Gly Pro Trp Ser Lys Thr Gly Gly Leu Gly Asp Val Leu Gly Gly Leu 
    50                  55                  60 

Pro Pro Ala Leu Ala 
65 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 1..27 

(D) OTHER INFORMATION: /note= "Amino acid sequence encoded 
by nucleotides 296-377 of SEQ ID NO. 2." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Ala Arg Gly His Arg Val Met Thr Ile Ser Pro Arg Tyr Asp Gln Tyr 
1               5                   10                  15 

Lys Asp Ala Trp Asp Thr Gly Val Ala Val Glu 
            20                  25 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 1..33 

(D) OTHER INFORMATION: /note= "Amino acid sequence encoded 
by nucleotides 452-550 of SEQ ID NO. 2." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Val Lys Val Gly Asp Ser Ile Glu Ile Val Arg Phe Phe His Cys Tyr 
1               5                   10                  15 

Lys Arg Gly Val Asp Arg Val Phe Val Asp His Pro Met Phe Leu Glu 
            20                  25                  30 

Lys 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 1..30 

(D) OTHER INFORMATION: /note= "Amino acid sequence encoded 
by nucleotides 647-736 of SEQ ID NO. 2." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Val Trp Gly Lys Thr Gly Ser Lys Ile Tyr Gly Pro Lys Ala Gly Leu 
1               5                   10                  15 

Asp Tyr Leu Asp Asn Glu Leu Arg Phe Ser Leu Leu Cys Gln 
            20                  25                  30 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 1..21 

(D) OTHER INFORMATION: /note= "Amino acid sequence encoded 
by nucleotides 815-878 of SEQ ID NO. 2." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Ala Ala Leu Glu Ala Pro Lys Val Leu Asn Leu Asn Ser Ser Asn Tyr 
1               5                   10                  15 

Phe Ser Gly Pro Tyr 
            20 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 1..34 

(D) OTHER INFORMATION: /note= "Amino acid sequence encoded 
by nucleotides 878 and 959-1059 of SEQ ID NO. 2." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Gly Glu Asp Val Leu Phe Ile Ala Asn Asp Trp His Thr Ala Leu Ile 
1               5                   10                  15 

Pro Cys Tyr Leu Lys Ser Met Tyr Gln Ser Arg Gly Ile Tyr Leu Asn 
            20                  25                  30 

Ala Lys 
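
The note for SEQ ID NO:11 names two separate nucleotide ranges (878 and 959-1059 of SEQ ID NO:2), so the first codon spans the sequence lying between them. The sketch below is illustrative only and not part of the filed listing: it joins the two ranges before translating, assuming the standard genetic code and 1-based inclusive coordinates. `REGION` holds positions 841-1080 of SEQ ID NO:2 as listed; `join_ranges` and `REGION_START` are hypothetical names used for this example.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "*",
}

# Positions 841-1080 of SEQ ID NO:2, copied from the listing.
REGION_START = 841
REGION = (
    "GAATTTGAACAGTAGCAACTACTTCTCAGGACCATATGGTAATTAACACATCCTAGTTTC"
    "AGAAAACTCCTTACTATATCATTGTAGGTAATCATCTTTATTTTGCCTATTCCTGCAGGA"
    "GAGGATGTTCTCTTCATTGCCAATGATTGGCACACAGCTCTCATTCCTTGCTACTTGAAG"
    "TCAATGTACCAGTCCAGAGGAATCTACTTGAATGCCAAGGTAAAATTTCTTTGTATTCAC"
)


def join_ranges(ranges):
    """Concatenate 1-based inclusive (start, end) ranges of SEQ ID NO:2."""
    return "".join(
        REGION[start - REGION_START:end - REGION_START + 1]
        for start, end in ranges
    )


coding = join_ranges([(878, 878), (959, 1059)])
peptide = " ".join(
    THREE_LETTER[CODON_TABLE[coding[i:i + 3]]]
    for i in range(0, len(coding), 3)
)
print(len(coding), peptide)
```

The joined sequence begins with G (position 878) followed by GA (positions 959-960), giving the codon GGA for the leading Gly.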

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 1..38 

(D) OTHER INFORMATION: /note= "Amino acid sequence encoded 
by nucleotides 1150-1263 of SEQ ID NO 2." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Val Ala Phe Cys Ile His Asn Ile Ala Tyr Gln Gly Arg Phe Ser Phe 
1               5                   10                  15 

Ser Asp Phe Pro Leu Leu Asn Leu Pro Asp Glu Phe Arg Gly Ser Phe 
            20                  25                  30 

Asp Phe Ile Asp Gly Tyr 
        35 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 79 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 1..79 

(D) OTHER INFORMATION: /note= "Amino acid sequence encoded 
by nucleotides 1349-1585 of SEQ ID NO 2." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Lys Pro Val Lys Gly Arg Lys Ile Asn Trp Met Lys Ala Gly Ile Leu 
1               5                   10                  15 

Glu Ser His Arg Val Val Thr Val Ser Pro Tyr Tyr Ala Gln Glu Leu 
            20                  25                  30 

Val Ser Ala Val Asp Lys Gly Val Glu Leu Asp Ser Val Leu Arg Lys 
        35                  40                  45 

Thr Cys Ile Thr Gly Ile Val Asn Gly Met Asp Thr Gln Glu Trp Asn 
    50                  55                  60 

Pro Ala Thr Asp Lys Tyr Thr Asp Val Lys Tyr Asp Ile Thr Thr 
65                  70                  75 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 59 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 1..59 

(D) OTHER INFORMATION: /note= "Amino acid sequence encoded 
by nucleotides 1676-1855 of SEQ ID NO 2." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Val Met Asp Ala Lys Pro Leu Leu Lys Glu Ala Leu Gln Ala Ala Val 
1               5                   10                  15 

Gly Leu Pro Val Asp Lys Lys Ile Pro Leu Ile Gly Phe Ile Gly Arg 
            20                  25                  30 

Leu Glu Glu Gln Lys Gly Ser Asp Ile Leu Ala Val Ala Ile His Lys 
        35                  40                  45 

Phe Ile Gly Leu Asp Val Gln Ile Val Val Leu 
    50                  55 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 64 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 1..64 

(D) OTHER INFORMATION: /note= "Amino acid sequence encoded 
by nucleotides 1945-2136 of SEQ ID NO 2." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Gly Thr Gly Lys Lys Glu Phe Glu Gln Glu Ile Glu Gln Leu Glu Val 
1               5                   10                  15 

Leu Tyr Pro Asn Lys Ala Lys Gly Val Ala Lys Phe Asn Val Pro Leu 
            20                  25                  30 

Ala His Met Ile Thr Ala Gly Ala Asp Phe Met Leu Val Pro Ser Arg 
        35                  40                  45 

Phe Glu Pro Cys Gly Leu Ile Gln Leu His Ala Met Arg Tyr Gly Thr 
    50                  55                  60 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 1..29 

(D) OTHER INFORMATION: /note= "Amino acid sequence encoded 
by nucleotides 2301-2386 of SEQ ID NO 2." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Val Pro Ile Cys Ala Ser Thr Gly Gly Leu Val Asp Thr Val Lys Glu 
1               5                   10                  15 

Gly Tyr Thr Gly Phe His Met Gly Ala Phe Asn Val Glu 
            20                  25 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 1..19 

(D) OTHER INFORMATION: /note= "Amino acid sequence encoded 
by nucleotides 2492-2549 of SEQ ID NO 2." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Cys Asp Val Val Asp Pro Ala Asp Val Leu Lys Ile Val Thr Thr Val 
1               5                   10                  15 

Ala Arg Ala 
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 111 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 1..111 

(D) OTHER INFORMATION: /note= "Amino acid sequence encoded 
by nucleotides 1200-1532 of SEQ ID NO 5." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Met Ala Ser Ile Thr Ala Ser His His Phe Val Ser Arg Ser Gln Thr 
1               5                   10                  15 

Ser Leu Asp Thr Lys Ser Thr Leu Ser Gln Ile Gly Leu Arg Asn His 
            20                  25                  30 

Thr Leu Thr His Asn Gly Leu Arg Ala Val Asn Lys Leu Asp Gly Leu 
        35                  40                  45 

Gln Ser Thr Thr Asn Thr Lys Val Thr Pro Lys Met Ala Ser Arg Thr 
    50                  55                  60 

Glu Thr Lys Arg Pro Gly Cys Ser Ala Thr Ile Val Cys Gly Lys Gly 
65                  70                  75                  80 

Met Asn Leu Ile Phe Val Gly Thr Glu Val Gly Pro Trp Ser Lys Thr 
                85                  90                  95 

Gly Gly Leu Gly Asp Val Leu Gly Gly Leu Pro Pro Ala Leu Ala 
            100                 105                 110 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 1..43 

(D) OTHER INFORMATION: /note= "Amino acid sequence encoded 
by nucleotides 3817-3945 of SEQ ID NO. 5." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Cys Asp Val Val Asp Pro Ala Asp Val Leu Lys Ile Val Thr Thr Val 
1               5                   10                  15 

Ala Arg Ala Leu Ala Val Tyr Gly Thr Leu Ala Phe Ala Glu Met Ile 
            20                  25                  30 

Lys Asn Cys Met Ser Glu Glu Leu Ser Trp Lys 
        35                  40 
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 1..38 

(D) OTHER INFORMATION: /note= "Amino acid sequence encoded 
by nucleotides 4031-4144 of SEQ ID NO. 5." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Glu Pro Ala Lys Lys Trp Glu Thr Leu Leu Leu Gly Leu Gly Ala Ser 
1               5                   10                  15 

Gly Ser Glu Pro Gly Val Glu Gly Glu Glu Ile Ala Pro Leu Ala Lys 
            20                  25                  30 

Glu Asn Val Ala Thr Pro 
        35 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: RNA 

(ix) FEATURE: 

(A) NAME/KEY: misc_RNA 

(B) LOCATION: 1 

(D) OTHER INFORMATION: /note= "Nucleotide 1 is a 7-methyl 
guanine added by 5'-5' linkage as an RNA cap." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

GAUGGCAAGA AAAAAAA 



